Handling Multiple Outcomes in Health Data

Sivakamakshi Muthu Kumarasamy, Shivadarshini Sreekanth1

1 School of Mathematical and Statistical Sciences, University of Galway

Background and Problem

The analysis of multiple outcomes has become increasingly central to health research, driven by the need to understand complex disease dynamics and multifaceted treatment responses. Studies frequently examine diverse outcomes simultaneously, including survival endpoints (such as mortality or disease progression), binary classifications (such as high/low risk categories), and continuous metrics (such as quality-of-life assessments). Traditional methods, such as separate Cox models for survival outcomes or logistic regression for binary responses, treat outcomes as independent and ignore their biological and temporal connections. Composite endpoints (e.g., PFS2) try to summarize effects in a single measure but can hide differences across outcome types.

Treating outcomes separately limits realistic disease modelling, where early progression affects later survival and quality of life. Joint modelling, using shared random effects or latent processes, offers a better solution but remains rare because of its computational demands and the lack of standard tools.
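The point about shared random effects can be illustrated with a minimal simulation sketch. It assumes a hypothetical patient-level latent effect that drives both a survival time and a continuous score. The function name, distributions and all parameter values are illustrative assumptions and are not estimated from either dataset.

```python
import numpy as np

def simulate_shared_effect(n_patients=5000, sigma_b=1.0, seed=0):
    """Simulate two outcomes linked by one patient-level random effect.

    Hypothetical set-up (not fitted to any study data):
      b_i ~ Normal(0, sigma_b^2)                shared latent effect
      T_i ~ Exponential(rate = 0.1 * exp(b_i))  time to event
      Y_i = 50 - 5 * b_i + Normal(0, 5^2)       continuous score
    """
    rng = np.random.default_rng(seed)
    b = rng.normal(0.0, sigma_b, size=n_patients)
    rate = 0.1 * np.exp(b)  # larger b_i means a higher hazard
    time_to_event = rng.exponential(scale=1.0 / rate)
    score = 50.0 - 5.0 * b + rng.normal(0.0, 5.0, size=n_patients)
    return time_to_event, score

# With a shared effect the two outcomes are correlated ...
t_linked, y_linked = simulate_shared_effect(sigma_b=1.0)
corr_linked = np.corrcoef(np.log(t_linked), y_linked)[0, 1]

# ... and without it (sigma_b = 0) they are generated independently.
t_indep, y_indep = simulate_shared_effect(sigma_b=0.0)
corr_indep = np.corrcoef(np.log(t_indep), y_indep)[0, 1]

print(f"correlation with shared effect:    {corr_linked:.2f}")
print(f"correlation without shared effect: {corr_indep:.2f}")
```

Separate univariate models fitted to such data would each be valid marginally, but neither would use the information that one outcome carries about the other.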

Objectives of Project

  1. To compare approaches that treat outcomes independently, via composite indices, or through multivariate frameworks, with a potential expansion to evaluate multi-task modelling.

  2. To replace traditional univariate methods with modern integrated approaches.

  3. To implement joint modelling techniques linking time-to-event, categorical and continuous outcomes via shared disease pathway hierarchies.
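One generic way to write the link described in the third objective is a shared random-effects formulation. It is given here as a sketch only and is not necessarily the model this project will adopt. All symbols are assumptions introduced for illustration: $b_i$ is a patient-level latent effect, $x_i$ a covariate vector, $T_i$ a time-to-event outcome with baseline hazard $h_0(t)$, $D_i$ a binary outcome, $Y_i$ a continuous outcome, and the $\alpha$ coefficients control how strongly each outcome depends on the shared effect.

```latex
\begin{aligned}
b_i &\sim N(0, \sigma_b^2), \\
h_i(t \mid b_i) &= h_0(t)\exp\!\left(x_i^\top \beta_T + \alpha_T b_i\right), \\
\operatorname{logit}\,\Pr(D_i = 1 \mid b_i) &= x_i^\top \beta_D + \alpha_D b_i, \\
Y_i \mid b_i &\sim N\!\left(x_i^\top \beta_Y + \alpha_Y b_i,\ \sigma_Y^2\right).
\end{aligned}
```

Under this sketch the outcomes are conditionally independent given $b_i$, and setting every $\alpha$ to zero recovers the separate univariate analyses.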

Data Sources and Datasets

For the clinical trial analysis, an initial investigation will be conducted using the SECOMBIT (Sequential Combo Immuno and Target therapy) study dataset. This dataset comprises 209 patients with metastatic melanoma and a BRAF mutation enrolled in a three-arm, prospective, randomized phase II study evaluating the best sequential approach with combo immunotherapy (ipilimumab/nivolumab) and combo target therapy (LGX818/MEK162). The primary objective of the original study (Ascierto et al. 2023) was to identify the best sequencing of the combination treatments with respect to the primary efficacy variable, Overall Survival (OS). Its many secondary objectives included evaluating the effects of the three sequencing strategies on Total Progression-Free Survival (PFS). The dataset comes from a follow-up to the original study that focuses on 4-year survival and biomarker evaluation in the phase II SECOMBIT trial (Ascierto et al. 2024).

For the predictive modelling objective, the initial investigation will be conducted on the Memorial Sloan Kettering Cancer Center (MSKCC) prostatectomy cohort. This dataset comprises 181 patients diagnosed with primary prostate cancer who underwent surgical treatment (prostatectomy). The data come from a study by Taylor et al. (2010), which focused on the annotation of prostate cancer genomes. The dataset contains three clinically important time-to-event outcomes of interest: biochemical recurrence (BCR) free survival, metastasis-free survival and overall survival.
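When several time-to-event outcomes are recorded for the same patient, one simple composite is the time to the first observed event. The sketch below shows that construction under stated assumptions: the function name is hypothetical, the toy numbers are made up for illustration and are not taken from the MSKCC cohort, and patients with no observed event are censored at their earliest censoring time, which is one conservative convention among several.

```python
import numpy as np

def composite_endpoint(times, events):
    """Time to first event across several time-to-event outcomes.

    times  : (n_patients, n_outcomes) follow-up times
    events : (n_patients, n_outcomes) 1 if the event was observed, 0 if censored
    Returns (composite_time, composite_event, first_outcome_index);
    first_outcome_index is -1 when no event was observed.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    # Censored outcomes cannot trigger the composite, so mask them with +inf.
    event_times = np.where(events, times, np.inf)
    any_event = events.any(axis=1)
    # Patients with an event: earliest event time.
    # Patients without: earliest censoring time (a conservative assumption).
    composite_time = np.where(any_event, event_times.min(axis=1), times.min(axis=1))
    first_idx = np.where(any_event, event_times.argmin(axis=1), -1)
    return composite_time, any_event.astype(int), first_idx

# Made-up toy data; columns stand for (BCR, metastasis, death), times in months.
toy_times = [[12, 30, 40], [50, 50, 50], [20, 25, 60]]
toy_events = [[1, 0, 0], [0, 0, 0], [1, 1, 0]]

comp_time, comp_event, first_idx = composite_endpoint(toy_times, toy_events)
print(comp_time, comp_event, first_idx)
```

The returned index makes the information loss explicit: two patients with the same composite time can differ in which outcome triggered it, and that distinction is what a composite endpoint discards.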

Early Results / Descriptive Statistics of Datasets

Some basic summaries of the SECOMBIT dataset, by treatment arm, are given in the table below:

Variable   Level     A (N=69)      B (N=71)      C (N=69)      Overall (N=209)
sites      (blank)   0 (0%)        1 (1.4%)      1 (1.4%)      2 (1.0%)
           >=3       26 (37.7%)    29 (40.8%)    26 (37.7%)    81 (38.8%)
           1-2       43 (62.3%)    41 (57.7%)    42 (60.9%)    126 (60.3%)
ULN_LDH    (blank)   2 (2.9%)      2 (2.8%)      1 (1.4%)      5 (2.4%)
           elevated  26 (37.7%)    28 (39.4%)    20 (29.0%)    74 (35.4%)
           normal    41 (59.4%)    41 (57.7%)    48 (69.6%)    130 (62.2%)
TMB        (blank)   41 (59.4%)    46 (64.8%)    39 (56.5%)    126 (60.3%)
           <10       20 (29.0%)    17 (23.9%)    18 (26.1%)    55 (26.3%)
           >=10      8 (11.6%)     8 (11.3%)     12 (17.4%)    28 (13.4%)
JAK        (blank)   40 (58.0%)    47 (66.2%)    39 (56.5%)    126 (60.3%)
           mut       5 (7.2%)      7 (9.9%)      14 (20.3%)    26 (12.4%)
           wt        24 (34.8%)    17 (23.9%)    16 (23.2%)    57 (27.3%)
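Each cell in the summary above is a count followed by its percentage of the column total. The short sketch below reproduces that arithmetic for the ULN_LDH "elevated" row, using counts and arm sizes copied from the table. The helper name `n_percent` is hypothetical and this is not the code used to build the poster.

```python
def n_percent(count, total):
    """Format a count as 'n (p%)', with the percentage to one decimal place."""
    return f"{count} ({100 * count / total:.1f}%)"

# Arm sizes and ULN_LDH 'elevated' counts, copied from the summary table.
arm_sizes = {"A": 69, "B": 71, "C": 69}
elevated_ldh = {"A": 26, "B": 28, "C": 20}

cells = {arm: n_percent(elevated_ldh[arm], arm_sizes[arm]) for arm in arm_sizes}
cells["Overall"] = n_percent(sum(elevated_ldh.values()), sum(arm_sizes.values()))

for arm, cell in cells.items():
    print(f"{arm}: {cell}")
```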

Figure 1 and Figure 2 below show the patterns in our dataset.

Figure 1: Great figure!

Figure 2: Amazing, right?!

Figure 3: Amazing, right?!

Next Project Steps

We plan to conduct further analysis using:

  • Variable discombobulation 1
  • Expand our minds with explosive machine learning 2.

We will use the plasticanalysis package for this.

GitHub

The code and datasets for this project can be viewed at our GitHub repository here: https://github.com/

References


  1. Massey et al. 2005 doi: 15.36.413

  2. Smith et al. 1991 doi: 12.36.486